A computational-graph partitioning method for training memory-constrained DNNs
Authors
Abstract
Many state-of-the-art Deep Neural Networks (DNNs) have substantial memory requirements. Limited device memory becomes a bottleneck when training those models. We propose ParDNN, an automatic, generic, and non-intrusive partitioning strategy for DNNs that are represented as computational graphs. ParDNN decides a placement of the DNN's underlying computational graph operations across multiple devices so that the devices' memory constraints are met and the training time is minimized. It is completely independent of the deep learning aspects of the DNN: it requires no modification at the model level, nor at the systems-level implementation of its operation kernels. ParDNN partitions models having billions of parameters and hundreds of thousands of operations in seconds to a few minutes. Our experiments with TensorFlow on 16 GPUs demonstrate efficient training of 5 very large models while achieving superlinear scaling in both batch size and throughput. ParDNN either outperforms or qualitatively improves upon the related work.
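To illustrate the general idea of placing graph operations under per-device memory constraints, the sketch below shows a simple greedy placement heuristic. This is an assumption-laden illustration only: the operation names, costs, and the greedy strategy are invented here and are not ParDNN's actual algorithm.

```python
# Illustrative sketch only: greedy memory-constrained placement of
# computational-graph operations across devices. This is NOT ParDNN's
# algorithm; all names and costs below are hypothetical.

def greedy_placement(ops, device_capacity, num_devices):
    """Assign each op (name, memory_cost) to the currently least-loaded
    device; raise if the chosen device cannot hold the op."""
    loads = [0.0] * num_devices
    placement = {}
    # Place the largest ops first to reduce fragmentation.
    for name, cost in sorted(ops, key=lambda x: -x[1]):
        device = min(range(num_devices), key=lambda d: loads[d])
        if loads[device] + cost > device_capacity:
            raise ValueError(f"op {name} does not fit on any device")
        loads[device] += cost
        placement[name] = device
    return placement

# Hypothetical ops with memory costs in GB.
ops = [("matmul_1", 4.0), ("conv_1", 3.0), ("relu_1", 0.5), ("matmul_2", 4.0)]
placement = greedy_placement(ops, device_capacity=8.0, num_devices=2)
```

A real partitioner must additionally account for communication cost along graph edges and for the critical path of the schedule, which is what distinguishes approaches like ParDNN from naive bin packing.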
Similar resources
In-Place Activated BatchNorm for Memory-Optimized Training of DNNs
In this work we present In-Place Activated Batch Normalization (INPLACE-ABN) – a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing str...
Size-constrained graph partitioning polytope
We consider the problem of clustering a set of items into subsets whose sizes are bounded from above and below. We formulate the problem as a graph partitioning problem and propose an integer programming model for solving it. This formulation generalizes several well-known graph partitioning problems from the literature like the clique partitioning problem, the equi-partition problem and the k-...
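One common way to write such a size-constrained partitioning problem as an integer program uses node-to-cluster assignment variables with a linearized cut indicator. This is a generic textbook-style formulation for illustration; it is not necessarily the exact model proposed in the paper above.

```latex
\begin{align*}
\min \quad & \sum_{(u,v) \in E} w_{uv}\, z_{uv} \\
\text{s.t.} \quad & \sum_{k=1}^{K} y_{uk} = 1 && \forall u \in N
    && \text{(each node in one cluster)} \\
& \ell \le \sum_{u \in N} y_{uk} \le L && \forall k
    && \text{(cluster-size bounds)} \\
& z_{uv} \ge y_{uk} - y_{vk} && \forall (u,v) \in E,\ \forall k
    && \text{(cut indicator)} \\
& y_{uk}, z_{uv} \in \{0,1\}.
\end{align*}
```

Minimization drives $z_{uv}$ to $0$ whenever $u$ and $v$ share a cluster, so the objective counts only the weight of edges crossing the partition.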
A computational study of graph partitioning
Let G = (N, E) be an edge-weighted undirected graph. The graph partitioning problem is the problem of partitioning the node set N into k disjoint subsets of specified sizes so as to minimize the total weight of the edges connecting nodes in distinct subsets of the partition. We present a numerical study on the use of eigenvalue-based techniques to find upper and lower bounds for this problem. Re...
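The eigenvalue-based techniques mentioned above are related to spectral partitioning, where a graph is bisected by the sign pattern of the Fiedler vector (the eigenvector of the Laplacian's second-smallest eigenvalue). The sketch below illustrates that general technique on a toy graph; it is not the specific bounding procedure studied in the paper.

```python
# A minimal spectral-bisection sketch: split nodes by the sign of the
# Fiedler vector of the graph Laplacian. Illustrative only.
import numpy as np

def spectral_bisect(adj):
    """adj: symmetric weighted adjacency matrix (numpy array).
    Returns a boolean mask assigning each node to one of two parts."""
    degree = np.diag(adj.sum(axis=1))
    laplacian = degree - adj
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # ascending eigenvalues
    fiedler = eigvecs[:, 1]                       # second-smallest eigenvalue
    return fiedler >= 0

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by a single edge.
adj = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    adj[u, v] = adj[v, u] = 1.0
mask = spectral_bisect(adj)  # separates the two triangles
```

The eigenvalues themselves also yield bounds on the minimum cut weight (e.g. Fiedler-type lower bounds), which is the flavor of result the computational study above evaluates.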
A cover partitioning method for bound constrained global optimization
A stochastic algorithm for global optimization subject to simple bounds is described. The method is applicable to black-box functions which may be non-smooth or discontinuous. The algorithm is in the spirit of the deterministic algorithm direct of Jones, Perttunen, and Stuckman. Like direct, it generates successively finer covers of the feasible region, where each cover consists of a finite num...
Edge Partitioning in External-Memory Graph Search
There is currently much interest in using external memory, such as disk storage, to scale up graph-search algorithms. Recent work shows that the local structure of a graph can be leveraged to substantially improve the efficiency of externalmemory graph search. This paper introduces a technique, called edge partitioning, which exploits a form of local structure that has not been considered in pr...
Journal
Journal title: Parallel Computing
Year: 2021
ISSN: 1872-7336, 0167-8191
DOI: https://doi.org/10.1016/j.parco.2021.102792